A tutorial review: Metabolomics and partial least squares-discriminant analysis--a marriage of convenience or a shotgun wedding.

Authors

  • Piotr S Gromski
  • Howbeer Muhamadali
  • David I Ellis
  • Yun Xu
  • Elon Correa
  • Michael L Turner
  • Royston Goodacre
Abstract

The predominance of partial least squares-discriminant analysis (PLS-DA) for analyzing metabolomics datasets (indeed, it is the best-known tool for classification and regression in metabolomics) has led to a situation in which not all researchers are fully aware of alternative multivariate classification algorithms. This may in part be due to the widespread availability of PLS-DA in most well-known statistical software packages, where it is very easy to run with the default settings. In addition, one of the perceived advantages of PLS-DA is its ability to analyze highly collinear and noisy data. Furthermore, the calibration model provides a variety of useful statistics, such as prediction accuracy, as well as scores and loadings plots. However, the method may give misleading results, largely due to a lack of suitable statistical validation, when used by non-experts who are unaware of its potential limitations in metabolomics. This tutorial review aims to provide an introductory overview of several straightforward statistical methods, such as principal component-discriminant function analysis (PC-DFA), support vector machines (SVM) and random forests (RF), which could very easily be used either to augment PLS or as alternative supervised learning methods to PLS-DA. These methods are particularly appropriate for the analysis of large, highly complex data sets, which are common outputs of metabolomics studies, where the number of variables often far exceeds the number of samples. In addition, these alternative techniques may be useful tools for generating parsimonious models through feature selection and data reduction, as well as providing more propitious results. We sincerely hope that the general reader is left with little doubt that there are several promising and readily available alternatives to PLS-DA for analyzing large and highly complex data sets.
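The abstract's warning about missing statistical validation can be illustrated with a small, hedged sketch. It is not the authors' code and it does not implement PLS-DA: a plain nearest-centroid classifier stands in for any supervised method, the data are pure random noise with far more variables than samples, and every name (`fit`, `predict`, `loo_accuracy`, `demo`) is hypothetical. Under those assumptions, accuracy on the training data should look excellent, while leave-one-out cross-validation should show that nothing real was learned.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(X, y):
    """Nearest-centroid 'model': one mean vector per class label."""
    return {
        lab: centroid([x for x, l in zip(X, y) if l == lab])
        for lab in sorted(set(y))
    }


def predict(model, x):
    """Assign x to the class whose centroid is closest."""
    return min(model, key=lambda lab: sq_dist(x, model[lab]))


def accuracy(model, X, y):
    """Fraction of samples in (X, y) that the model labels correctly."""
    return sum(predict(model, x) == l for x, l in zip(X, y)) / len(y)


def loo_accuracy(X, y):
    """Leave-one-out cross-validation: refit without each sample, then test on it."""
    hits = 0
    for i in range(len(y)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, X[i]) == y[i]
    return hits / len(y)


def demo(n_samples=20, n_features=1000, seed=0):
    """Noise-only data with arbitrary labels: there is no true class structure."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # labels unrelated to X
    train_acc = accuracy(fit(X, y), X, y)  # optimistic: tested on training data
    cv_acc = loo_accuracy(X, y)            # honest: tested on held-out samples
    return train_acc, cv_acc


if __name__ == "__main__":
    train_acc, cv_acc = demo()
    print(f"training accuracy:      {train_acc:.2f}")
    print(f"leave-one-out accuracy: {cv_acc:.2f}")
```

The gap between the two numbers is the point: when variables greatly outnumber samples, a model can separate arbitrary labels on the data it was fitted to, so any reported accuracy should come from held-out samples (and ideally be compared against a permutation-based null).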


Similar resources

1H-NMR-Based Metabolomics Study of Cerebral Infarction

Background and Purpose: Stroke is one of the leading causes of adult disability and death in developing countries. However, early diagnosis is difficult and no reliable biomarker is currently available. Thus, we applied a 1H-NMR metabolomics approach to investigate the altered metabolic pattern in plasma and urine from patients with cerebral infarctions and sought to identify metabolic biomarkers...


Metabolomics Profile of Potato Tubers after Phosphite Treatment

Phosphite (Phi)-based fungicides are used to control the oomycete Phytophthora infestans which causes late blight disease, the most devastating disease in potatoes. In order to examine the effects of Phi-based fungicides on potato tubers through foliar or post-harvest application, a metabolite profiling approach based on gas chromatography coupled to mass spectrometry (GC-MS) has been establish...


Metabolic Signature of Remote Ischemic Preconditioning Involving a Cocktail of Amino Acids and Biogenic Amines

Methods and Results: Rat plasma samples from RIPC and control groups were analyzed using a targeted metabolomic approach aimed at measuring 188 metabolites. Principal component analysis and orthogonal partial least-squares discriminant analysis were used to identify the metabolites that discriminated between groups. Plasma samples from 50 patients subjected to RIPC were secondarily explored to ...


1H NMR-based Plasma Metabolic Profiling of Dairy Cows with Type I and Type II Ketosis

This study identified differences in plasma metabolites among three groups of dairy cows: type I ketotic (K1), type II ketotic (K2), and healthy control cows (C). 50 cows with two or three parities were selected at 7–28 days postpartum. Cows were classified as type I ketotic (K1, 20 cows), type II ketotic (K2, 20 cows), or healthy control cows (C, 10 cows). Plasma metabolomic profiles were anal...


Simultaneous Spectrophotometric Determination of Iron, Cobalt and Copper by Partial Least-Squares Calibration Method in Micellar Medium

Iron, cobalt and copper are metals that appear together in many real samples, both natural and artificial. Recently, a classical univariate micellar colorimetric method has been developed for the determination of these metal ions. Organized molecular assemblies such as micelles are used in spectroscopic measurements because of their possible effects on the systems of interest. The ability of mi...



Journal:
  • Analytica chimica acta

Volume 879  Issue 

Pages  -

Publication date: 2015